For alloy thermodynamics, we obtain unique, physical effective cluster interactions (ECI) from 
truncated cluster expansions (CE) via subspace-projection from a complete configurational Hilbert 
space; structures form a (sub)space spanned by a locally complete set of cluster functions. Subspace- 
projection is extended using Fractional Factorial Design with subspace "augmentation" to remove 
systematically the ECI linear dependencies due to excluded cluster functions - controlling conver- 
gence and bias error, with a dramatic reduction in the number of structural energies needed. No 
statistical fitting is required. We illustrate the formalism for a simple Hamiltonian and Ag-Au alloys 
using density-functional theory. 

PACS numbers: 64.60.De, 64.60.Cn, 75.10.Hk, 02.30.Mv 



I. INTRODUCTION 

A cluster expansion (CE) has proved to be an invaluable multi-scaling technique for generating 
cluster-based Hamiltonians that allow large numbers of configurational energies to be calculated 
efficiently via a small set of density functional theory (DFT) calculations. Hence, the CE 
Hamiltonian is well suited for modeling alloy thermodynamics and phase diagrams and for performing 
ground-state searches over a large number of configurations on a fixed lattice. Although the CE is 
an exact basis-set expansion in terms of cluster correlation functions, whose coefficients 
(a priori unknown) are effective cluster interactions (ECI), it is infeasible to determine all ECI 
for a P-component alloy on an N-site lattice (where N is large), as it would require the 
computation of all structural energies via expensive DFT calculations, defeating the purpose of 
CE as a multi-scaling tool. For example, there are over 4 billion possible configurations for a 
modest N=32 atom cubic cell of a (FCC) binary alloy (P=2). 

Instead, a truncated CE (trCE) is constructed from a training set of M energies via structural 
inversion. The truncation is, however, not unique, but there is only one true set. To minimize 
the mean-squared error associated with trCE, one has to balance the variance (the data's numerical 
noise) and the bias (an inaccurate model for the estimator). Conventionally, ECI are treated as 
fitting parameters to obtain a trCE "best fit" to known DFT structural energies (assumed to carry 
random numerical noise). To prevent over- or under-fitting, a 'predictive' measure (e.g., the 
leave-one-out cross-validation error, $\mathrm{CV}_1$) is used to select a trCE, with emphasis on 
balancing errors from truncation and variance (data noise). However, well-converged DFT energies 
should be virtually free of random noise. Also, for large learning sets, model selection via 
minimizing $\mathrm{CV}_1$ could result in overfitting; a trCE with $\mathrm{CV}_1$ below the 
data's noise level should not be selected. Recent efforts to improve the predictive capability of 
trCE analyze only errors arising from variance. Little has been done to address how bias impacts 
the predictive capability of trCE. 

Here, with DFT structural energies assumed noiseless, 
we show that the only sources of error are the ECI of 
cluster functions excluded from the trCE set (the bias) 
and that the choice of structures in the set dictates the 
way errors are distributed. Thus, the cluster functions 
included in the trCE can be linearly dependent on ex- 
cluded cluster functions, affecting convergence and error. 
We show that bias is reduced when physically important 
clusters are included in trCE. 

The CE of a binary alloy (with complete cluster basis functions) is the Walsh-Hadamard 
transformation. Combining this with concepts from fractional factorial design (FFD), we show that 
linear dependencies between ECI can be deduced geometrically if the M structures used in 
structural inversion are from a locally complete Hilbert subspace. By prescribing a large 
supercell as the complete configuration Hilbert space, we can detail each subspace and identify 
linear dependencies between ECI. In this subspace-projection, structures form a (sub)space spanned 
by a locally complete set of cluster functions uniquely determined using a physical hierarchy. 
Errors in trCE are eliminated when the subspace of known structures is large enough that all 
physically significant ECI are included. What remains to be determined is the size of the 
subspace required and how one resolves critical ECI linear dependencies. To answer these, we use 

1. FFD concepts to construct complete (sub)spaces and identify linear dependencies between the 
excluded cluster functions and the truncated set. 

2. Cluster hierarchy established by the moment theorem to ensure the choice of key physical ECI, 
yielding a locally complete CE that is unique. 

In addition, we elucidate the physical/mathematical meaning of ECI, which has at times been 
overlooked by treating the ECI only as fitting parameters, e.g., in genetic-algorithm searches, 
resulting in non-unique (often unphysical) cluster sets with similar CV scores. 

We first review the CE formalism and its relationship to Hadamard matrices used in signal 
processing and fractional factorial design (or design of experiment), where the issues faced are 
similar to those in the CE. Our subspace-projection CE formalism is illustrated first by a simple 
model Hamiltonian, and then by a detailed application to Ag-Au using DFT, where a subspace of 
modest size (4 times fewer energies vs. current methods) yields a trCE with good predictive 
capability without statistical fitting. The predictive capability of the trCE is validated with 
extra DFT structural energies, and the ECI reflect the physical hierarchy used in 
subspace-projection. We note that statistical validation is applied in our methodology even 
though no statistical fitting is required. 



II. CLUSTER EXPANSION OVERVIEW 

The CE is a basis-set expansion of alloy properties in terms of cluster entities, giving a formal 
and exact representation when all clusters are included; in its most general form, the CE is 
applicable to any multicomponent alloy on any fixed lattice. Here we use orthogonal cluster 
functions constructed from spin variables. 

Labeling the sites on an N-site lattice with integers {1,2,...,N}, the vector 
$\sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_N\}$ is used to describe a given structure (or 
configuration), where $\sigma_i = 1\,(-1)$ if site i is occupied by an A (B) atom in an A-B alloy. 
Expanded in terms of cluster functions, the energy of an alloy structure is expressed as 

$$ E(\sigma) = \sum_{\eta} J_{\eta}\, \Phi_{\eta}(\sigma) , \tag{1} $$

where $\eta = \{i_1, i_2, \ldots, i_n\}$ is a set of integers that denotes the sites selected to 
form an n-site cluster ($n \le N$), with $i_k \in \{1,2,\ldots,N\}$. The summation is over all 
$2^N$ clusters possible within the N-site lattice, including the empty cluster $\eta_0$, which 
gives a constant term, $J_0$, independent of $\sigma$. The $J_\eta$ coefficients are called the 
effective cluster interactions (ECI). The cluster functions, $\Phi$, constructed from Chebyshev 
polynomials, are defined as 

$$ \Phi_{\eta}(\sigma) = \sigma_{i_1} \sigma_{i_2} \cdots \sigma_{i_n} , \tag{2} $$

with $\Phi_0 = 1$, and form an orthogonal basis set spanning the $2^N$ configuration space. For 
example, $\Phi_{\{i\}}(\sigma) = \sigma_i$ is the single-site function at site i and 
$\Phi_{\{i,j\}}(\sigma) = \sigma_i \sigma_j$ is the pair function for sites i and j, for a given 
$\sigma$. Note that, except for $\Phi_0$, $\Phi_\eta(\sigma) = 1$ or $-1$. 

Alternatively, Eq. (1) may be re-expressed in full to include all possible configurations and 
correlation functions, 

$$ \mathbf{E} = [\Phi_1, \Phi_2, \ldots, \Phi_{2^N}]\, \mathbf{J} \equiv \mathbf{\Phi}\, \mathbf{J} . \tag{3} $$

$\mathbf{E}$ is a $2^N$-component vector with each component being the energy of one of the $2^N$ 
possible alloy structures, $\sigma$; $\Phi_\eta$ are $2^N$-component vectors, with each row being 
the correlation function of structure $\sigma$. Each cluster set $\eta$ is labelled by an integer 
from 1 to $2^N$. The set of $\{\Phi_\eta\}$ forms an orthogonal array and obeys the orthogonality 
condition, 

$$ \mathrm{Tr}^{(N)}\, \Phi_{\eta}(\sigma)\, \Phi_{\eta'}(\sigma) = \delta_{\eta\eta'} , \tag{4} $$

where the trace, $\mathrm{Tr}^{(N)} = 2^{-N} \sum_{\sigma_1 = \pm 1} \cdots \sum_{\sigma_N = \pm 1}$, 
is over all $2^N$ configurations; this is essentially a dot-product between two correlation 
function vectors. 

If every $E(\sigma)$ in $\mathbf{E}$ is evaluated (e.g., via DFT calculations), the ECI are simply 
obtained from Eq. (3) via a matrix inversion, 

$$ \mathbf{J} = \mathbf{\Phi}^{-1} \mathbf{E} . \tag{5} $$



However, first-principles calculations are computationally costly, making it impossible to 
evaluate all $E(\sigma)$ for even a modest value of N (N=32 gives ~ 4 billion configurations); 
thus, in practice, only a small fraction (typically between 30 and 100) are evaluated and used to 
construct a CE for an alloy system, and through structural inversion only a subset of ECI can then 
be determined. Therefore, two critical choices have to be made: (1) the subset of $E(\sigma)$ for 
structural inversion and (2) the subset of ECI to be determined, neither of which should be left 
to guesswork. 
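
As a minimal sketch of Eqs. (1)-(5), the following Python/NumPy snippet builds the complete 
cluster-function matrix for a small isolated N-site cell, checks orthogonality, and recovers the 
ECI by inversion. The helper name `cluster_matrix`, the choice N = 3, and the random "true" ECI 
are illustrative assumptions, not part of the formalism above. 

```python
import itertools
import math

import numpy as np


def cluster_matrix(n_sites):
    """Return (configs, clusters, phi) for an isolated n_sites cell.

    Rows of phi are the 2^N configurations, columns are the 2^N cluster
    functions Phi_eta(sigma) = product of spins on the sites in eta.
    """
    configs = list(itertools.product([1, -1], repeat=n_sites))
    clusters = [eta
                for size in range(n_sites + 1)
                for eta in itertools.combinations(range(n_sites), size)]
    # The empty cluster gives the constant function (empty product = 1).
    phi = np.array([[math.prod(sigma[i] for i in eta) for eta in clusters]
                    for sigma in configs])
    return configs, clusters, phi


n_sites = 3                                  # illustrative size only
configs, clusters, phi = cluster_matrix(n_sites)

# Orthogonality, Eq. (4): Phi^T Phi = 2^N * identity.
gram = phi.T @ phi

# Hypothetical "true" ECI; the energies then follow from Eq. (3).
rng = np.random.default_rng(0)
j_true = rng.normal(size=2 ** n_sites)
energies = phi @ j_true

# Eq. (5): because Phi is orthogonal, Phi^{-1} = Phi^T / 2^N.
j_recovered = phi.T @ energies / 2 ** n_sites
```

The inversion is exact only because all $2^N$ energies are supplied; the sections that follow 
concern what happens when they are not. 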



A. Error Analysis of Structural Inversion 

In the standard model for least-squares fitting, the observed values, $\mathcal{E}$, are related 
to the values of the true model, $\mathbf{E}$, by 

$$ \mathcal{E} = \mathbf{E} + \epsilon , \tag{6} $$

where $\epsilon$ is a randomly distributed error with zero mean and fixed variance. This implies 
that 

$$ \langle \mathcal{E} \rangle = \mathbf{E} + \langle \epsilon \rangle = \mathbf{E} , \tag{7} $$

where $\langle \ldots \rangle$ denotes expectation values averaged over all possible observations 
having the same atomic configuration, $\sigma$. For us, the random noise in DFT data may arise 
from various computational settings (e.g., different energy cutoffs, k-points, convergence 
criteria). $\langle \mathcal{E}(\sigma) \rangle$ is the expected energy of configuration $\sigma$ 
averaged over all such computational settings. 
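
For a given $\sigma$, the mean squared error (MSE) of its estimator, $\hat{E}(\sigma)$, may be 
decomposed into a variance and a bias (see Appendix A), 

$$ \mathrm{MSE}[\hat{E}(\sigma)] = \left\langle \left[ \hat{E}(\sigma) - \langle \hat{E}(\sigma) \rangle \right]^2 \right\rangle + \left[ \langle \hat{E}(\sigma) \rangle - E(\sigma) \right]^2 = \mathrm{Var} + \mathrm{Bias} . \tag{8} $$

$E(\sigma)$ is the true value and $\langle \hat{E}(\sigma) \rangle$ is the estimator constructed 
from the expected observations. 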

To minimize the MSE, one has to balance the variance (from data noise) and the bias (an 
inaccurate model for the estimator). For a fixed learning set, a trCE that includes too few 
clusters gives a large bias, although the variance may be small (under-fitting), while too many 
clusters leads to over-fitting (large variance). Both under- and over-fitting lead to a large MSE 
and thus to a large prediction error. To balance the variance and bias, most CE practitioners use 
$\mathrm{CV}_1$, with the issues discussed in the introduction. While the variance term is 
reduced with well-converged DFT energies and/or by using more DFT energies (data points) in the 
fit, we show that the bias term is reduced when physically important clusters are included in the 
trCE. 

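We now rewrite Eq. (3) by dividing $\mathbf{J}$ into two subvectors $\mathbf{J}_1$ and 
$\mathbf{J}_2$ of length M and $2^N - M$, respectively, with $\mathbf{J}_1$ to be determined via 
structural inversion (SI), leaving out $\mathbf{J}_2$: 

$$ \begin{bmatrix} \mathbf{E}_1 \\ \mathbf{E}_2 \end{bmatrix} = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix} \begin{bmatrix} \mathbf{J}_1 \\ \mathbf{J}_2 \end{bmatrix} , \tag{9} $$

where $\phi_{11}$ is an L-by-M matrix with $2^N > L \ge M$. The variance term is then given by 

$$ \mathrm{Var} = \mathrm{Var}(\epsilon)\; \phi^{\sigma}_{R1} \left( \phi_{11}^{T} \phi_{11} \right)^{-1} \left( \phi^{\sigma}_{R1} \right)^{T} , \tag{10} $$

where $\mathrm{Var}(\epsilon)$ is the variance of the data noise, $\phi^{\sigma}_{R1}$ is a row 
vector of cluster functions in $\phi_{R1}$ (where R=1 or 2) corresponding to configuration 
$\sigma$, and $(\phi_{11}^{T} \phi_{11})^{-1}$ is the covariance matrix. Although the variance of 
the data noise is fixed and beyond one's control, the variance term may be reduced by including 
specific configurations that reduce its average $\langle \ldots \rangle_{\sigma}$ over a large 
set of configurations. As for the bias term, 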



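$$ \mathrm{Bias} = \left[ \langle \hat{E}(\sigma) \rangle - E(\sigma) \right]^2 . \tag{11} $$

$\hat{\mathbf{J}}_1$ is the estimator of $\mathbf{J}_1$ and is obtained via SI using a 
least-squares method, 

$$ \hat{\mathbf{J}}_1 = \left( \phi_{11}^{T} \phi_{11} \right)^{-1} \phi_{11}^{T} \mathbf{E}_1 , \tag{12} $$

provided that $\phi_{11}^{T} \phi_{11}$ is invertible. Detailed derivations for the variance and 
bias terms are in Appendix A. 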

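The choice of $\mathbf{E}_1$ already precludes certain combinations of $\mathbf{J}_1$ that would 
render $\phi_{11}^{T} \phi_{11}$ singular. Notably, under the least-squares method, the estimator 
for $\mathbf{J}_2$ is always zero, i.e., $\hat{\mathbf{J}}_2 = 0$. Unless $\mathbf{J}_2$ is truly 
zero, $\hat{\mathbf{J}}_1$ is a biased estimator; that is, 

$$ \langle \hat{\mathbf{J}}_1 \rangle = \mathbf{J}_1 + \left( \phi_{11}^{T} \phi_{11} \right)^{-1} \left( \phi_{11}^{T} \phi_{12} \right) \mathbf{J}_2 = \mathbf{J}_1 + \delta \mathbf{J}_1 , \tag{13} $$

derived by substituting $\mathbf{E}_1 = \phi_{11} \mathbf{J}_1 + \phi_{12} \mathbf{J}_2$ from 
Eq. (9) into (12). The mean estimator of the known structural energies is thus 

$$ \langle \hat{\mathbf{E}}_1 \rangle = \phi_{11} \langle \hat{\mathbf{J}}_1 \rangle = \mathbf{E}_1 + \left[ \phi_{11} \left( \phi_{11}^{T} \phi_{11} \right)^{-1} \left( \phi_{11}^{T} \phi_{12} \right) - \phi_{12} \right] \mathbf{J}_2 = \mathbf{E}_1 + \delta \mathbf{E}_1 . \tag{14} $$

Likewise, for structural energies not used for SI, 

$$ \langle \hat{\mathbf{E}}_2 \rangle = \phi_{21} \langle \hat{\mathbf{J}}_1 \rangle = \mathbf{E}_2 + \left[ \phi_{21} \left( \phi_{11}^{T} \phi_{11} \right)^{-1} \left( \phi_{11}^{T} \phi_{12} \right) - \phi_{22} \right] \mathbf{J}_2 = \mathbf{E}_2 + \delta \mathbf{E}_2 . \tag{15} $$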
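
The bias relations in Eqs. (12)-(15) can be checked numerically. The sketch below assumes 
noiseless energies generated from a hypothetical random ECI vector on a 3-site isolated cell; the 
particular rows (structures) and columns (retained cluster functions) are arbitrary illustrative 
choices. 

```python
import numpy as np

h1 = np.array([[1, 1], [1, -1]])
phi = np.kron(np.kron(h1, h1), h1)        # complete 8x8 matrix for N = 3

rng = np.random.default_rng(1)
j_true = rng.normal(size=8)               # hypothetical "true" ECI
energies = phi @ j_true                   # noiseless energies, Eq. (3)

# Illustrative partition: L = 6 known structures, M = 3 retained ECI
# (the constant and two single-site functions in this column ordering).
rows_1, rows_2 = [0, 1, 2, 3, 4, 6], [5, 7]
cols_1, cols_2 = [0, 1, 2], [3, 4, 5, 6, 7]

phi11 = phi[np.ix_(rows_1, cols_1)]
phi12 = phi[np.ix_(rows_1, cols_2)]
phi21 = phi[np.ix_(rows_2, cols_1)]
phi22 = phi[np.ix_(rows_2, cols_2)]
j1, j2 = j_true[cols_1], j_true[cols_2]
e1, e2 = energies[rows_1], energies[rows_2]

gram_inv = np.linalg.inv(phi11.T @ phi11)

# Eq. (12): least-squares estimator of the retained ECI.
j1_hat = gram_inv @ phi11.T @ e1

# Eq. (13): bias in the retained ECI caused solely by the excluded J2.
delta_j1 = gram_inv @ phi11.T @ phi12 @ j2

# Eqs. (14)-(15): errors in fitted and in predicted energies.
delta_e1 = phi11 @ j1_hat - e1
delta_e2 = phi21 @ j1_hat - e2

# With the excluded ECI set to zero the estimator is unbiased.
j1_hat_no_j2 = gram_inv @ phi11.T @ (phi11 @ j1)
```

Every error term is linear in `j2`, which is the point of the derivation: with noiseless data, 
the excluded ECI are the only source of error. 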



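Our goal is then to minimize the bias term over all structures, i.e., 
$\langle \mathrm{Bias} \rangle_{\sigma}$, given by 

$$ \langle \mathrm{Bias} \rangle_{\sigma} = \frac{|\delta \mathbf{E}_1|^2}{L} + \frac{|\delta \mathbf{E}_2|^2}{(2^N - L)} , \tag{16} $$

which will be achieved if $\mathbf{J}_2 = 0$, i.e., the true values of the excluded interactions 
are zero. We stress that minimizing $|\delta \mathbf{E}_1|^2 / L$ alone (i.e., least-squares 
fitting) will not minimize $\langle \mathrm{Bias} \rangle_{\sigma}$ in general. In this case, a 
full-rank, invertible $\phi_{11}$ would result in $|\delta \mathbf{E}_1|^2 / L = 0$. However, 
unless $\mathbf{J}_2 = 0$, errors in $\mathbf{E}_2$ still remain, 

$$ \delta \mathbf{E}_2 = \left[ \phi_{21} \phi_{11}^{-1} \phi_{12} - \phi_{22} \right] \mathbf{J}_2 . \tag{17} $$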



Thus, structures from $\mathbf{E}_2$ are needed for validation. 

We thus see that the only source of error for the bias term comes from $\mathbf{J}_2$. Accepting 
that we have well-converged DFT energies, such that $\mathcal{E}$ in (6) is precise and noiseless, 
one only needs to minimize $\langle \mathrm{Bias} \rangle_{\sigma}$ to obtain a reliable trCE. We 
showcase an approach based on fractional factorial design of experiment to identify linearly 
dependent ECI and, via a hierarchical approach, add physically important ECI to construct a 
unique trCE. In doing so, the number of physically important ECI in $\mathbf{J}_2$ decreases and 
one approaches the unique CE. 

We first show that errors are incurred when $\mathbf{J}_1$ is evaluated with only a fraction of 
known "experimental" data ($\mathbf{E}_1$). These concepts provide a specific method to select 
the structural energies for $\mathbf{E}_1$ such that $\phi_{11}$ remains a Hadamard matrix, and 
they show clearly how $\mathbf{J}_2$ is the source of error for $\delta \mathbf{E}_1$, 
$\delta \mathbf{E}_2$ and $\delta \mathbf{J}_1$. 


III. RELATION TO HADAMARD MATRICES 

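When $\{\Phi_\eta\}$ in Eq. (3) are arranged in a certain lexicographical order, $\mathbf{\Phi}$ 
becomes the Hadamard matrix, commonplace in factorial design of experiments and signal 
processing. Several classes of Hadamard matrices exist, of which the Sylvester type, of size 
$2^N$-by-$2^N$, are of direct relevance to the CE. Starting from a single lattice site labelled 
as 1, 

$$ \mathcal{H}_{\{1\}} = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} = \left[ \Phi_0, \Phi_{\{1\}} \right] , \tag{18} $$

with the configuration space fully spanned by the 2-component vectors $\Phi_0$ and 
$\Phi_{\{1\}}$. With two lattice sites, 

$$ \mathcal{H}_{\{1,2\}} = \mathcal{H}_{\{1\}} \otimes \mathcal{H}_{\{2\}} = \begin{bmatrix} \mathcal{H}_{\{2\}} & \mathcal{H}_{\{2\}} \\ \mathcal{H}_{\{2\}} & -\mathcal{H}_{\{2\}} \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} = \left[ \Phi_0, \Phi_{\{1\}}, \Phi_{\{2\}}, \Phi_{\{1,2\}} \right] . \tag{19} $$

The four possible configurations are given by $[\Phi_{\{1\}}, \Phi_{\{2\}}]$; e.g., the second 
row corresponds to a structure with atomic type $-1$ and 1 on sites 1 and 2, respectively. For a 
general N-site lattice the Hadamard matrix is 

$$ \mathcal{H}_{\{1,\ldots,N\}} = \mathcal{H}_{\{1\}} \otimes \mathcal{H}_{\{2\}} \otimes \cdots \otimes \mathcal{H}_{\{N\}} , \tag{20} $$

which satisfies the property 

$$ \mathcal{H}_{\{1,\ldots,N\}}^{T}\, \mathcal{H}_{\{1,\ldots,N\}} = 2^N\, I_{2^N} , \tag{21} $$

where $I_{2^N}$ is the $2^N$-by-$2^N$ identity matrix. In addition, the columns and rows of 
$\mathcal{H}_{\{1,\ldots,N\}}$ are the Walsh functions, commonly used in spectral analysis of 
rectangular waveforms; hence, in a complete $2^N$ vector space, 

$$ \mathbf{\Phi} = \mathcal{H}_{\{1,\ldots,N\}} . \tag{22} $$

Equations (3) and (5) are the Hadamard-Walsh transformation and its inverse, respectively, with 
the ECI being Walsh coefficients. 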
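
A short sketch of the Sylvester construction in Eqs. (18)-(21) follows, using NumPy's Kronecker 
product. The function names `sylvester_hadamard` and `walsh_coefficients` are illustrative, not 
from any library. 

```python
from functools import reduce

import numpy as np


def sylvester_hadamard(n_sites):
    """Kronecker product of n_sites copies of the 2x2 matrix in Eq. (18)."""
    h1 = np.array([[1, 1], [1, -1]])
    return reduce(np.kron, [h1] * n_sites)


def walsh_coefficients(energies):
    """Inverse transform: J = H^T E / 2^N, which follows from Eq. (21)."""
    energies = np.asarray(energies, dtype=float)
    n_sites = int(np.log2(energies.size))
    return sylvester_hadamard(n_sites).T @ energies / energies.size


h12 = sylvester_hadamard(2)               # reproduces the matrix in Eq. (19)
h4 = sylvester_hadamard(4)
identity_check = h4.T @ h4                # equals 2^4 times the identity

# Round trip with hypothetical ECI: E = H J, then J is recovered.
j_example = np.array([0.5, 0.3, -0.2, 0.1])
j_back = walsh_coefficients(h12 @ j_example)
```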



IV. FACTORIAL DESIGN AND ECI OF 
ISOLATED CELLS 

A. ECI via Full Factorial Design 

The full factorial design space is spanned by the columns of the Hadamard matrix 
$\mathcal{H}_{\{1,\ldots,N\}}$. Using N=2 for illustration and Eq. (19), the full factorial design 
is given by 

$$ \left[ E_{11}, E_{\bar{1}1}, E_{1\bar{1}}, E_{\bar{1}\bar{1}} \right]^{T} = \mathcal{H}_{\{1,2\}} \left[ J_0, J_1, J_2, J_{1,2} \right]^{T} , \tag{23} $$

where the subscripts of E denote the combination of $\sigma$ (c.f. Eq. (19), with $\bar{1}$ 
denoting $-1$) while those of J label atomic sites, i.e., $i_n = 1$ or 2. $J_1$ and $J_2$ are 
single-site interactions of site 1 and 2, respectively, while $J_{1,2}$ is the 2-body (pair) 
interaction between sites 1 and 2. We emphasize that such a formalism is identical to a CE of an 
isolated cell with no periodic boundary conditions (see Fig. 1). 

In the nomenclature of factorial design, $\mathbf{E}$ is called the full experimental data set to 
be explained using N factors (sites 1 and 2) with each factor having 2 possible levels, 1 or $-1$ 
(analogous to the spin variable at each site). $\mathbf{E}$ consists of $2^N$ data points, with 
each represented by a unique combination of N levels (subscripts of E). $\mathbf{E}$ is fully 
explained by a model consisting of $2^N$ effects, consisting of a constant, the N factors 
(single-site clusters) and all possible interactions between the factors constructed by 
multiplying the relevant factors (pair and multibody clusters). 

B. Physical meaning of J 

The coefficients, J, obtained via the matrix inversion of $\mathcal{H}_{\{1,2\}}$, have specific 
physical meanings. Specifically, 

$$ J_0 = \tfrac{1}{4} \left( E_{11} + E_{\bar{1}1} + E_{1\bar{1}} + E_{\bar{1}\bar{1}} \right) \tag{24} $$

$$ J_1 = \tfrac{1}{4} \left( E_{11} - E_{\bar{1}1} + E_{1\bar{1}} - E_{\bar{1}\bar{1}} \right) \tag{25} $$

$$ J_2 = \tfrac{1}{4} \left( E_{11} - E_{1\bar{1}} + E_{\bar{1}1} - E_{\bar{1}\bar{1}} \right) \tag{26} $$

$$ J_{1,2} = \tfrac{1}{4} \left( \left[ E_{11} - E_{\bar{1}1} \right] - \left[ E_{1\bar{1}} - E_{\bar{1}\bar{1}} \right] \right) . \tag{27} $$

Here $J_0$ gives the average value of all $2^N$ levels; $J_1$ gives the contrast of effect 1 
(single-site cluster at site 1) averaged over all possible levels of effect 2 (single-site 
cluster at site 2); that is, $E_{11} - E_{\bar{1}1}$ and $E_{1\bar{1}} - E_{\bar{1}\bar{1}}$ 
measure the effect of changing the levels in effect 1 with effect 2 fixed at levels 1 and $-1$, 
respectively. Likewise, $J_2$ gives the contrast of effect 2 averaged over all possible levels in 
effect 1. As for the 2-body interaction $J_{1,2}$, the 1-body effects (given in square brackets) 
are contrasted with respect to each other. 

Thus, there is a clear physical meaning and basis for the interactions, $J_\eta$. As we see next, 
the numerical values of the ECI depend on how the cluster functions (the basis set) are chosen 
for truncation. 

C. ECI via Fractional Factorial Design 




As noted in Section II, only a fraction of the $2^N$ possible experimental data (this includes 
DFT structural energies) are obtained in practice, either because the experiments are costly or 
the total number required is prohibitively large. In FFD, the sparsity of effects (or Pareto's) 
principle is assumed, i.e., all experimental data can be explained by a small number of effects. 
In a 2-level FFD, $1/2^k$ (k being an integer) of all possible experimental data are known. 

The FFD principle is useful for determining how the ECI in $\mathbf{J}$ are confounded; i.e., how 
interactions $\mathbf{J}_1$ and $\mathbf{J}_2$ in $\mathbf{J}$ (Eq. (9)) are correlated with one 
another. [We use the accepted nomenclature "confounded", especially because it distinguishes 
basis-set truncation effects from actual physical correlations, e.g., atomic short-range order.] 
Two ECI are confounded if it is impossible to ascertain their individual values from the known 
data set. Specifically, if $\phi_{11}$ is a Hadamard matrix, each ECI in $\mathbf{J}_2$ will be 
confounded with one and only one ECI in $\mathbf{J}_1$. 



Using (23) as an example, suppose only the first two experiments, $E_{11}$ and $E_{\bar{1}1}$, 
are evaluated (half of the four possible experiments). This subset forms a combinatoric subspace 
where all possibilities of $\Phi_{\{1\}}$ are included with $\Phi_{\{2\}}$ held at a fixed value 
of 1. As a result, 

$$ \left[ E_{11}, E_{\bar{1}1} \right]^{T} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \end{bmatrix} \left[ J_0, J_1, J_2, J_{1,2} \right]^{T} , \tag{28} $$

which is an under-determined set of linear equations; hence, it is impossible to solve for all 
ECI. At best only two of the four can be determined. Note that, because columns 1 and 3 of the 
(effect) matrix are identical, $J_0$ and $J_2$ are confounded, and likewise for $J_1$ and 
$J_{1,2}$. We now choose two ECI to evaluate, include them in $\mathbf{J}_1$, and evaluate them 
via (12). We avoid evaluating $J_0$ together with $J_2$, or $J_1$ together with $J_{1,2}$, which 
would render $\phi_{11}$ singular. 

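Now, suppose we choose to evaluate $J_0$ and $J_1$; comparing (9) with (23), we have 

$$ \mathbf{E}_1 = \left[ E_{11}, E_{\bar{1}1} \right]^{T} ; \quad \mathbf{E}_2 = \left[ E_{1\bar{1}}, E_{\bar{1}\bar{1}} \right]^{T} \tag{29} $$

$$ \mathbf{J}_1 = \left[ J_0, J_1 \right]^{T} ; \quad \mathbf{J}_2 = \left[ J_2, J_{1,2} \right]^{T} \tag{30} $$

$$ \phi_{11} = \phi_{12} = \phi_{21} = -\phi_{22} = \mathcal{H}_{\{1\}} , \tag{31} $$

where the last relation results from the property of Hadamard matrices, see (19). From Eq. (21), 
the error analysis for the truncated case in Eqs. (13)-(15) is, respectively, simplified to 

$$ \langle \hat{\mathbf{J}}_1 \rangle = \mathbf{J}_1 + \mathbf{J}_2 \tag{32} $$

$$ \delta \mathbf{E}_1 = 0 \tag{33} $$

$$ \delta \mathbf{E}_2 = 2 \mathcal{H}_{\{1\}} \mathbf{J}_2 . \tag{34} $$

From (32), the truncated set, $\langle \hat{\mathbf{J}}_1 \rangle = [J_0 + J_2, J_1 + J_{1,2}]^{T}$, 
is clearly a biased estimate of $\mathbf{J}_1$. While $\mathbf{E}_1$ is reproduced exactly, the 
source of error (34) for $\mathbf{E}_2$ lies solely in $\mathbf{J}_2$. To ensure that 
$\hat{\mathbf{J}}_1$ predicts structural energies well, the ECI in $\mathbf{J}_2$ should be zero 
(or negligible); however, these values of $\mathbf{J}_2$ are not known a priori. A physics-based 
hierarchy is thus needed to rank the relative importance of the ECI. 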
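
The half-fraction example of Eqs. (28)-(34) can be reproduced in a few lines. This is a sketch 
with hypothetical ECI values chosen only for illustration. 

```python
import numpy as np

h1 = np.array([[1, 1], [1, -1]])
h12 = np.kron(h1, h1)                        # Eq. (19)

# Hypothetical true ECI, ordered as [J0, J1, J2, J12].
j_true = np.array([0.5, 0.3, -0.2, 0.1])
energies = h12 @ j_true                      # full factorial design, Eq. (23)

# Half fraction: only the first two energies are "measured".
e1, e2 = energies[:2], energies[2:]
j1, j2 = j_true[:2], j_true[2:]

# phi11 = H_{1}, Eq. (31); solve for the two retained ECI [J0, J1].
j1_hat = np.linalg.solve(h1, e1)

# Eq. (32): each retained ECI absorbs the excluded ECI it is confounded with.
bias = j1_hat - j1

# Eq. (33): the fitted energies are reproduced exactly.
delta_e1 = h1 @ j1_hat - e1

# Eq. (34): the prediction error for the unused energies is 2 H_{1} J2.
delta_e2 = h1 @ j1_hat - e2
```

A perfect fit to `e1` therefore says nothing about the quality of the prediction for `e2`, 
which is why validation structures are required. 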



D. Hierarchical Order and Heredity Effect 

The crux of the issue is that one ECI from each of the two sets, $\{J_0, J_2\}$ and 
$\{J_1, J_{1,2}\}$, has to be neglected, because the two ECI in each set are confounded. In FFD, 
this choice is in general made using the hierarchical ordering principle, i.e., higher-order 
interactions are smaller in magnitude and hence less important than lower-order ones. With the 
a priori assumption that $|J_0| > |J_2|$ and $|J_1| > |J_{1,2}|$, one would choose to evaluate 
$J_0$ over $J_2$ and $J_1$ over $J_{1,2}$. In addition, the effect heredity principle states that 
if a higher-order effect is important, then at least one of its parent effects is important. 
Thus, if we had instead evaluated $J_{1,2}$, both $J_1$ and $J_2$ (parent effects of $J_{1,2}$) 
must be evaluated. 

The concepts from FFD are necessarily applicable to CE. Because one only evaluates the DFT 
energies of a fraction of all $2^N$ configurations, the ECI are confounded, and the confounding 
relations are affected by the choice of structures in $\mathbf{E}_1$. When the structures form a 
complete configuration subspace, we show below that the confounding relation can be explained via 
geometry. A physical hierarchy is used to select the physically most important ECI from a set of 
confounded ECI; clusters with fewer sites and smaller spatial extent are physically more 
important. When the trCE basis is compact and locally complete, the effect heredity principle is 
observed as well. Notably, such principles also reflect the underlying physical origin of the ECI 
in the CE, where a clear hierarchy of clusters exists, as quantifiable from the moment theorem, 
which is the fundamental basis for tight-binding (or Debye-Hückel) and the generalized 
perturbation methods. 



V. FACTORIAL DESIGN AND ECI FROM 
CELLS WITH PERIODIC BOUNDARIES 

For the CE to represent correctly the thermodynamics of bulk alloys, the trCE has to be based on 
structures (or configurations) on an infinitely repeating lattice ($N \to \infty$). Hence, 
structural energies are calculated using periodic boundary conditions. Typically in CE, the 
clusters and configurations are classified according to the underlying symmetry of the lattice. 
The number of symmetry-unique structures generated by an N-site lattice equals the number of 
symmetry-distinct clusters needed in the exact CE. For structures, only the symmetry-unique ones 
require evaluation via DFT, and methods exist to identify them. When the clusters are classified 
according to symmetry (under the labels n and f), the CE in Eq. (1) can be re-expressed as 



$$ \frac{E(\sigma)}{N} = \sum_{n,f} D_{nf}\, J_{nf}\, \langle \Phi_{nf}(\sigma) \rangle , \tag{35} $$

where n is the number of sites defining the cluster (e.g., n=2 for pairs) and f enumerates 
symmetry-distinct clusters with the same n but different spatial extent (e.g., for pairs, f=1 for 
nearest neighbor (NN) and f=2 for 2nd NN), and there are $D_{nf}$ degenerate clusters for each 
group (e.g., $D_{21}=12$ and $D_{22}=6$ for the FCC Bravais lattice). 

Clusters with the same label have the same ECI and the cluster function is averaged over all 
lattice sites, i.e., 

$$ \langle \Phi_{nf}(\sigma) \rangle = \frac{1}{N D_{nf}} \sum_{d} \Phi_{\eta_{nfd}}(\sigma) , \tag{36} $$

where $\eta_{nfd}$ is the set of lattice sites $\{i_1, \ldots, i_n\}$ of a degenerate n-site 
cluster grouped under n, f. For a periodic structure, the site averaging is done in a finite-sized 
unit cell. Hence, for a complete space and assuming S symmetry-distinct clusters, the correlation 
matrix (see (3) and (22)) is simplified into a $2^N$-by-S matrix, 

$$ \mathbf{E} = \left[ \langle \Phi_{11} \rangle, \ldots, \langle \Phi_{nf} \rangle, \ldots \right] \mathbf{J}' , \tag{37} $$

where each column vector is an average of columns in the $2^N$-by-$2^N$ Hadamard matrix 
corresponding to the same cluster symmetry. Truncating the cluster function basis set inherently 
confounds ECI, as discussed above. However, when truncating in a finite-sized Hilbert space that 
is periodically repeated, the "confounding relations" for the ECI can be deduced from geometry, 
as we illustrate. 

FIG. 1. (color online) FCC lattice viewed in 3-D (top) and along [0 0 1] (bottom right). The 
2-site (4-site cubic) supercell is given by translation vectors [1 1 0], [−1 1 0] and [0 0 2] 
([2 0 0], [0 2 0] and [0 0 2]). Isolated 2-site cells are shown with dashed lines (bottom left). 
The cell is periodically repeated to form an FCC lattice (bottom right). Some ECI (see text) are 
highlighted with bold lines for pairs (red), 3-body (orange) and 4-body (green). For clarity, 
selected pair ECI are highlighted on the top figure as well. 



A. Confounding Relations between ECI 

We illustrate the confounding relations between the ECI by considering, for simplicity, a 2-site 
FCC supercell defined by translation vectors [1 1 0], [−1 1 0], [0 0 2], see Fig. 1. The 
single-site cluster function, $\Phi_{\{i\}} = \sigma_i$, at site $i \in \{1, 2\}$ is 1 ($-1$) if 
occupied by B (A). A complete configuration space is formed if all $2^2$ states in the supercell 
are considered. 

We start by considering the 2-atom supercell to be isolated (infinitely separated from other 
supercells). In this case, Eq. (23) constitutes the full Hilbert space of the isolated cell, 
where the conceivable interactions include only a constant, two single-site and one pair term, 
see Fig. 1. Eq. (23) can be rewritten as 

$$ \mathbf{E} = J_0 \Phi_{\{0\}} + J_1 \Phi_{\{1\}} + J_2 \Phi_{\{2\}} + J_{1,2} \Phi_{\{1,2\}} , \tag{38} $$

where $\Phi_\eta$ ($\eta = \{i_1, \ldots, i_n\}$) are defined in (19). When the supercells are 
assembled to form the FCC lattice, many interaction terms are possible (Fig. 1) and the energy of 
each 2-atom supercell is given by 

$$ \begin{aligned} \mathbf{E} ={}& J^{0} \Phi_{\{0\}} + J^{11}_{1} \Phi_{\{1\}} + J^{11}_{2} \Phi_{\{2\}} + 8 J^{21}_{1,2} \Phi_{\{1,2\}} + 2 J^{21}_{1,1} \Phi_{\{0\}} + 2 J^{21}_{2,2} \Phi_{\{0\}} \\ &+ 3 J^{22}_{1,1} \Phi_{\{0\}} + 3 J^{22}_{2,2} \Phi_{\{0\}} + 16 J^{23}_{1,2} \Phi_{\{1,2\}} + 4 J^{23}_{1,1} \Phi_{\{0\}} + 4 J^{23}_{2,2} \Phi_{\{0\}} + \cdots \\ &+ 8 J^{31}_{1,1,2} \Phi_{\{2\}} + 8 J^{31}_{1,2,2} \Phi_{\{1\}} + \cdots + 4 J^{41}_{1,1,2,2} \Phi_{\{0\}} + \cdots . \end{aligned} \tag{39} $$



The superscripts of the interactions $J^{nf}_{\eta}$ are the symmetry indices used for the CE, see 
Eq. (35), and the numerical prefactor gives the cluster degeneracy based on the symmetry at each 
of the two atomic sites. For clarity, we have limited the expansion to only the nearest-neighbor 
(NN) multibody ECI. 

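The confounding relations between $J^{nf}_{\eta}$ are apparent when (39) is written as 

$$ \begin{aligned} \mathbf{E} ={}& \left[ J^{0} + 2 J^{21}_{1,1} + 2 J^{21}_{2,2} + 3 J^{22}_{1,1} + 3 J^{22}_{2,2} + 4 J^{23}_{1,1} + 4 J^{23}_{2,2} + \cdots + 4 J^{41}_{1,1,2,2} + \cdots \right] \Phi_{\{0\}} \\ &+ \left[ J^{11}_{1} + 8 J^{31}_{1,2,2} + \cdots \right] \Phi_{\{1\}} + \left[ J^{11}_{2} + 8 J^{31}_{1,1,2} + \cdots \right] \Phi_{\{2\}} \\ &+ \left[ 8 J^{21}_{1,2} + 16 J^{23}_{1,2} + \cdots \right] \Phi_{\{1,2\}} . \end{aligned} \tag{40} $$

Based on a physical hierarchy, $J^{0}$, $J^{11}_{1}$, $J^{11}_{2}$ and $J^{21}_{1,2}$ are ECI of 
the most compact and important clusters and they are not confounded with one another. Each of 
them is, however, confounded with other ECI of larger spatial extent and larger n. The 
confounding relations are conveniently revealed by annihilating repeated subscripts (which denote 
atomic sites), 

$$ \Phi_{\{i_n, i_m\}} = \Phi_{\{0\}} , \quad \text{if } i_n = i_m . \tag{41} $$

Interactions with the same irreducible subscripts will be confounded. For example, using (41) the 
subscripts of $J^{31}_{1,2,2}$ reduce to 1 and, hence, the interaction is confounded with 
$J^{11}_{1}$. Note that $J^{0}$, $J^{11}_{1}$, $J^{11}_{2}$ and $J^{21}_{1,2}$, the compact and 
physically most important ECI, have irreducible subscripts for the 2-atom supercell. 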
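
The subscript-annihilation rule can be sketched as a small routine that cancels site labels in 
pairs and groups interactions by what remains. The interaction names and the dictionary layout 
are illustrative assumptions, not notation from the text. 

```python
from collections import Counter


def irreducible_sites(sites):
    """Cancel repeated site labels in pairs (sigma_i^2 = 1) and return the rest."""
    counts = Counter(sites)
    return tuple(sorted(site for site, count in counts.items() if count % 2 == 1))


def confounded_groups(interactions):
    """Group interaction names by their irreducible site subscripts."""
    groups = {}
    for name, sites in interactions.items():
        groups.setdefault(irreducible_sites(sites), []).append(name)
    return groups


# Hypothetical labels "J<nf>_<sites>" for a few terms of the 2-atom supercell.
interactions = {
    "J0": (),
    "J11_1": (1,),
    "J11_2": (2,),
    "J21_12": (1, 2),
    "J21_11": (1, 1),
    "J22_22": (2, 2),
    "J23_12": (1, 2),
    "J31_122": (1, 2, 2),
    "J31_112": (1, 1, 2),
    "J41_1122": (1, 1, 2, 2),
}

groups = confounded_groups(interactions)
```

Each key of `groups` is an irreducible subscript set (the empty tuple standing for the constant 
term), and the interactions listed under it are mutually confounded in this supercell. 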



B. Additional Symmetry Constraints 

As pointed out earlier, for a cluster expansion on a Bravais lattice, e.g., FCC, 
symmetry-degenerate clusters are grouped together as they have the same interaction values. 
Hence, $J^{nf}_{\eta} = J_{nf}$, $\forall\, \eta$. For example, for point ECI, 
$J^{11}_{1} = J^{11}_{2} = J_{11}$, and, for pair ECI, 
$J^{21}_{1,1} = J^{21}_{2,2} = J^{21}_{1,2} = J_{21}$. Enforcing the FCC symmetry on the 2-atom 
supercell generates only three unique structures (i.e., AA, AB and BB) and only three 
symmetry-unique ECI can be evaluated. Eq. (40) is then rewritten as 

$$ \begin{aligned} \mathbf{E} ={}& \left[ J_{0} + 4 J_{21} + 6 J_{22} + 8 J_{23} + \cdots + 4 J_{41} + \cdots \right] \Phi_{\{0\}} \\ &+ \left[ J_{11} + 8 J_{31} + \cdots \right] \left[ \Phi_{\{1\}} + \Phi_{\{2\}} \right] + \left[ 8 J_{21} + 16 J_{23} + \cdots \right] \Phi_{\{1,2\}} . \end{aligned} \tag{42} $$

Thus, three distinct sets of confounded ECI exist, namely, those confounded with the 

1. constant $J_0$: $\{J_0, J_{21}, J_{22}, J_{23}, J_{41}, \ldots\}$ 

2. point $J_{11}$: $\{J_{11}, J_{31}, \ldots\}$ 

3. 1st-NN pair $J_{21}$: $\{J_{21}, J_{23}, \ldots\}$ 

The ECI are listed according to physical hierarchy in each 
group. 

The Key Outcome - Selecting the most physically important cluster from each confounded set, 
$J_0$, $J_{11}$ and $J_{21}$ constitute the truncated (physical) $\mathbf{J}_1$, which are the 3 
independent ECI that can be determined within the present small Hilbert space. The neglected ECI 
bias the estimated $\hat{\mathbf{J}}_1$ according to (13). Using the orthogonality of the Hadamard 
matrix, the estimators of the ECI can be evaluated accordingly from (42), 

$$ \hat{J}_{0} + 4 \hat{J}_{21} = \mathbf{E} \cdot \Phi_{\{0\}} = \tfrac{1}{4} \left( E_{11} + E_{\bar{1}1} + E_{1\bar{1}} + E_{\bar{1}\bar{1}} \right) \tag{43} $$

$$ \hat{J}_{11} = \mathbf{E} \cdot \Phi_{\{1\}} = \mathbf{E} \cdot \Phi_{\{2\}} \tag{44} $$

$$ 8 \hat{J}_{21} = \mathbf{E} \cdot \Phi_{\{1,2\}} , \tag{45} $$

with expressions similar to those in (24) to (27). From (44) it can be shown that 
$E_{\bar{1}1} = E_{1\bar{1}}$, implying the presence of only 3 unique structural energies, 
$E_{11}$, $E_{\bar{1}1}$ and $E_{\bar{1}\bar{1}}$. Hence, the symmetry of the problem is properly 
reflected. Notice that, because the value of $\hat{J}_{21}$ is determined via (45), $J_{21}$ is 
no longer confounded with $J_0$ in (43). 
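
To make the bias explicit, the sketch below generates the four energies of the 2-atom supercell 
from Eq. (42) with hypothetical model ECI and then applies the estimators of Eqs. (43)-(45). All 
numerical values are made up for illustration, and the dot product is taken with the 1/4 
normalization shown in Eq. (43). 

```python
import numpy as np

# Hypothetical model ECI J_nf (illustrative values only).
model = {"J0": 1.0, "J11": 0.8, "J21": 0.2, "J22": 0.05,
         "J23": 0.02, "J31": 0.01, "J41": 0.005}


def supercell_energy(s1, s2, j=model):
    """Eq. (42), restricted to the ECI listed in `model`."""
    c0 = j["J0"] + 4 * j["J21"] + 6 * j["J22"] + 8 * j["J23"] + 4 * j["J41"]
    c1 = j["J11"] + 8 * j["J31"]
    c12 = 8 * j["J21"] + 16 * j["J23"]
    return c0 + c1 * (s1 + s2) + c12 * s1 * s2


# Configurations ordered as the rows of Eq. (19).
configs = [(1, 1), (-1, 1), (1, -1), (-1, -1)]
energies = np.array([supercell_energy(s1, s2) for s1, s2 in configs])

phi0 = np.ones(4)
phi1 = np.array([s1 for s1, _ in configs])
phi2 = np.array([s2 for _, s2 in configs])
phi12 = phi1 * phi2


def dot(e, phi):
    """Normalized dot product over the 2^2 configurations."""
    return e @ phi / 4


j21_hat = dot(energies, phi12) / 8                 # Eq. (45)
j11_hat = dot(energies, phi1)                      # Eq. (44)
j0_hat = dot(energies, phi0) - 4 * j21_hat         # Eq. (43)
```

With these inputs, each estimate equals the true value plus the excluded ECI that share its 
confounded set, weighted by their degeneracy prefactors. 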

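Lastly, when the energy is normalized with respect to the number of atoms (N = 2 in this case), 
the symmetry-reduced CE formalism given in (35) is recovered, i.e., 

$$ \begin{aligned} \frac{\mathbf{E}}{2} ={}& \frac{J_{0}}{2} \left[ \Phi_{\{0\}} \right] + J_{11} \left[ \tfrac{1}{2} \Phi_{\{1\}} + \tfrac{1}{2} \Phi_{\{2\}} \right] + 12 J_{21} \left[ \tfrac{2}{12} \Phi_{\{0\}} + \tfrac{4}{12} \Phi_{\{1,2\}} \right] + 6 J_{22} \left[ \tfrac{3}{6} \Phi_{\{0\}} \right] \\ &+ 24 J_{23} \left[ \tfrac{4}{24} \Phi_{\{0\}} + \tfrac{8}{24} \Phi_{\{1,2\}} \right] + 24 J_{31} \left[ \tfrac{4}{24} \Phi_{\{1\}} + \tfrac{4}{24} \Phi_{\{2\}} \right] + \cdots + 8 J_{41} \left[ \tfrac{2}{8} \Phi_{\{0\}} \right] \\ ={}& J_{01} \langle \Phi_{01} \rangle + J_{11} \langle \Phi_{11} \rangle + 12 J_{21} \langle \Phi_{21} \rangle + \cdots + 24 J_{31} \langle \Phi_{31} \rangle + \cdots + 8 J_{41} \langle \Phi_{41} \rangle , \end{aligned} \tag{46} $$

where we have used Eq. (36). Notably, $\langle \Phi_{01} \rangle$ is a (constant) column vector 
of 1's. 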



C. Physical Hierarchy of Clusters 

As discussed earlier, the J have physical meaning and are the coefficients of a Hadamard-Walsh 
transformation. The importance of a cluster can be ranked according to the number of sites 
(order n) and spatial extent (range f). The need for a hierarchical arrangement is clear: without 
it, one could equally well choose to evaluate $J_{22}$, $J_{31}$ and $J_{23}$, see (42), and 
still obtain a solution because these ECI are not confounded with one another. From the moment 
theorem, higher-order clusters are less important (smaller in magnitude), as verified in DFT. 

In addition, when an ECI of a higher-order cluster is included in $\mathbf{J}_1$, all ECI 
belonging to its subclusters must also be included to give a locally complete CE set, as 
reflected in the heredity principle in FFD. For clusters with the same n, those with larger 
spatial extent are less important, e.g., $|J_{21}| > |J_{22}| > |J_{23}| > \cdots$. Together, 
these mathematical/physical criteria permit a hierarchy of ranges for n-body ECI, i.e., 
$r_{(n)} > r_{(n+1)}$; 2-body ECI are longer range than 3-body, which are longer range than 
4-body, and so on. Essentially, the physical hierarchy requires that 

1. Higher-order (large n) clusters are less important (but, if an n-body cluster is included in 
the trCE, its subclusters must be included). 

2. For fixed n, clusters with larger spatial extent are less important. 

These rules maintain completeness within the local CE basis when mathematically implemented. 
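
As a rough illustration of how these two rules pick one ECI per confounded set, the sketch below 
ranks clusters by a simple lexicographic key (order n first, then range f). That key is a 
simplifying assumption made here for brevity; it does not encode the range hierarchy 
$r_{(n)} > r_{(n+1)}$ or the subcluster requirement, which a full implementation would need. 

```python
def hierarchy_key(cluster):
    """Rank a cluster labelled (n, f): fewer sites first, then shorter range."""
    n, f = cluster
    return (n, f)


def select_representatives(confounded_sets):
    """Pick the physically most important cluster from each confounded set."""
    return [min(group, key=hierarchy_key) for group in confounded_sets]


# Confounded sets of the symmetric 2-atom supercell, written as (n, f) labels;
# (0, 1) stands for the constant term.
confounded_sets = [
    [(2, 3), (4, 1), (0, 1), (2, 2)],   # terms multiplying the constant function
    [(3, 1), (1, 1)],                   # terms multiplying the point functions
    [(2, 3), (2, 1)],                   # terms multiplying the pair function
]

selected = select_representatives(confounded_sets)
```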



D. Systematically Unconfounding Key ECI 

As is clear from above, for a finite-size supercell we can group confounded ECI together utilizing concepts from FFD. The ECI in each confounded group are arranged according to a physical hierarchy and the most physically important ECI from each group is evaluated. The structures in the chosen supercell thus constitute a configuration subspace, which is necessarily spanned by the cluster functions of the most physically important ECI. The key task is then to find a minimal subspace to achieve this, given that the number of ECI required for a general alloy is finite and follows the physical hierarchy. To this end we offer the following resolution:

1. Define a large N-site supercell (with all possible 2^N configurations) as the "complete" Hilbert space.

2. Select a reasonably-sized supercell in the Hilbert space as the initial subspace; E1 (see (12)) includes DFT structural energies in this subspace.

3. From the physical hierarchy for clusters, the most important unconfounded ECI are evaluated via (12).

4. Augment the subspace to unconfound key, longer-ranged ECI (especially pairs). That is, check for physically important clusters whose ECI remain confounded and then unconfound each targeted ECI by systematically adding (augmenting) a structure from the complete Hilbert space (not in the initial subspace) to E1.

The first step is a conceptual construct allowing us to define a large enough supercell as our complete space. When the ECI of all important cluster functions spanning this space are known (complete), the CE is able to predict accurately all structural energies of the alloy system. A 2-atom supercell shown earlier is unlikely to unconfound key ECI of a binary, so steps 2 and 3 have to be accomplished using a bigger supercell, as we now exemplify.

TABLE I. Model Jnf and their degeneracy Dnf for a FCC lattice. The estimate, Ĵnf, via (12) is given for structures belonging to the subspace of a 2-site cell and the complete space given by the 4-site cubic cell.

                  Jnf       Ĵnf           Ĵnf
 n   f   Dnf     Model    2-site cell   4-site cell
 0   1     1      1          0.8           1
 1   1     1     -1         -0.2          -1
 2   1    12      1          1             1
 3   1    24      0.1                      0.1
 4   1     8     -0.1                     -0.1

FIG. 2. (color online) 4-site supercells defining an FCC lattice (outlined in black) as viewed from the top (axes [-1 1 0] and [1 1 0]). The ECI in Fig. 1 are shown also. The subscripts of the cluster functions are irreducible, see (41), so J31 and J41 are no longer confounded with J11 and J0, respectively (see text). J23 and J21 are still confounded, as are J22 and J0.



1. An Illustrative Example 

For FCC binaries, we create a model CE Hamiltonian (values are in Table I), such that all structural energies are defined by interactions within the nearest-neighbor (NN) range, i.e., only J01, J11, J21, J31, J41 ≠ 0. The model Hamiltonian is assumed to be unknown a priori and we seek to estimate its values via (12). We start by using the configuration subspace defined by the 2-site supercell (Fig. 1) to calculate an estimator, Ĵ1 = [Ĵ01, Ĵ11, Ĵ21]^T, whose components are the most physically important ECI that are not confounded with one another, see (42). Because we did not span the complete space, Ĵ1 is biased: some non-zero ECI remain confounded. From (42), it is clear that Ĵ01 will be confounded with J41 while Ĵ11 is confounded with J31. This is indeed the case as shown in Table I, e.g.,

    \hat{J}_{01} = J_{01} + (D_{41}/4) J_{41} .   (47)

Ĵ21 is not confounded with other non-zero ECI and is thus an unbiased estimate.
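
As a quick check of (47), the model values of Table I (J01 = 1, J41 = -0.1, D41 = 8) can be substituted directly; the result reproduces the 2-site-cell entry for n = 0 in that table:

```latex
\hat{J}_{01} = J_{01} + \frac{D_{41}}{4}\,J_{41}
             = 1 + \frac{8}{4}\,(-0.1)
             = 0.8
```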

We next consider the locally complete subspace generated by the cubic 4-site supercell, which contains all structures from the 2-site supercell and two new (A3B and AB3) structures. As illustrated in Fig. 2, J31 and J41 are no longer confounded with J11 and J01, respectively. This effectively unconfounds all non-zero ECI, so Ĵ1 is an unbiased estimate of J1, as shown in Table I.
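
The 2-site and 4-site estimates in Table I can be reproduced numerically with a small structural inversion. The following is a minimal sketch, not the authors' code: the NN correlation functions are the standard values for pure A, pure B, L1_0 and the two L1_2 structures (spins A = +1, B = -1), and the per-site weights Dnf/n are an assumption chosen so that the J41 term carries the factor D41/4 appearing in (47). All names are hypothetical.

```python
import numpy as np

# NN correlation functions (empty, point, pair, triplet, tetrahedron)
# for five FCC structures, with spins A = +1 and B = -1.
PHI = {
    "A":   [1.0,  1.0,  1.0,      1.0,  1.0],
    "B":   [1.0, -1.0,  1.0,     -1.0,  1.0],
    "L10": [1.0,  0.0, -1.0 / 3,  0.0,  1.0],
    "A3B": [1.0,  0.5,  0.0,     -0.5, -1.0],
    "AB3": [1.0, -0.5,  0.0,      0.5, -1.0],
}
# Model ECI from Table I: J01, J11, J21, J31, J41.
J_MODEL = np.array([1.0, -1.0, 1.0, 0.1, -0.1])
# ASSUMPTION: per-site weights D_nf / n (1 for the empty cluster),
# consistent with the factor D41/4 in Eq. (47).
WEIGHTS = np.array([1.0, 1.0, 12 / 2, 24 / 3, 8 / 4])


def energies(names):
    """Exact model energies of the named structures (all five ECI)."""
    phi = np.array([PHI[s] for s in names])
    return (phi * WEIGHTS) @ J_MODEL


def invert(names, n_eci):
    """Direct structural inversion keeping only the first n_eci clusters."""
    phi = np.array([PHI[s] for s in names])[:, :n_eci] * WEIGHTS[:n_eci]
    return np.linalg.solve(phi, energies(names))


# 2-site cell: three unique structures, three ECI (biased estimate).
J_2SITE = invert(["A", "B", "L10"], 3)
# 4-site cubic cell: five unique structures, five ECI (unbiased).
J_4SITE = invert(["A", "B", "L10", "A3B", "AB3"], 5)
print(J_2SITE)
print(J_4SITE)
```

Under these assumptions the 2-site inversion returns the confounded combinations (J01 absorbs part of J41, J11 absorbs part of J31), while the 4-site inversion recovers the model values.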

Indeed, the confounding between ECI (due to the often arbitrary choices of cluster functions that are included) is the main cause of variation of ECI between different publications and different predictions, which can now be eliminated if we are lucky enough to choose a configurational subspace that is spanned by the cluster functions of all significant non-zero Jnf.

For real alloy systems, J2f remains significant up to a longer range compared to multibody ECI (n > 2), hence it is necessary to unconfound the ECI of longer-range pairs. However, when the supercell is increased, as in going from the 2-site to the 4-site cell, one discovers that only the NN multibody Jn1 (n = 3, 4) are unconfounded; longer-ranged pairs such as J22 and J23 remain confounded, see the Fig. 2 caption. The increase in supercell size unconfounds short-ranged multibodies at a faster rate than longer-range pairs. To unconfound longer-range pairs while keeping the number of required (DFT-calculated) structural energies very small, structures from an augmented space (step 4 in the above resolution) are added systematically to the existing subspace. Importantly, one augmented structure is added at a time to unconfound a long-range ECI.

Generally, unconfounded (unique) truncated ECI are achieved by a limited augmentation of the initial configuration subspace, as discussed in Sec. VI and illustrated for the Ag-Au case study. We find that the truncated ECI from augmentation of the configuration subspace have predictive capability comparable to those selected by CV1, but with four times fewer structural energies.



FIG. 3. (color online) (a) Schematic of a 32-Cubic FCC supercell forming a Hilbert space. Subspaces formed by 8-Rh (dashed) and 8-DO22 (dot-dashed) supercells are shown. Translation vectors are given in Table II. (b) Supercells viewed along [0 0 1], with corner (circle) and face-centered (square) sites marked. For convenience, the lattice constant (corresponding to 2nd NN) is given 2 units. For the 'complete' 32-Cubic cell, all pairs up to J24 are unconfounded, as is J26. However, J25 is confounded with J21. For the 8-Rh cell, all pairs up to J22 are unconfounded, but J23 is confounded with J21. For the 8-DO22 subspace, pairs up to J23 are unconfounded, but J24 is confounded either with J0 or J22.



VI. AUGMENTED SUBSPACE-PROJECTION: 
FCC LATTICES 

We now exemplify the formalism for practical application, applied to FCC Ag-Au in Section VII. For FCC alloys, a cubic 32-atom supercell (Fig. 3) is selected as a 'complete' Hilbert space (denoted as 32-Cubic) with 2^32 (≈4.3 billion) configurations. If all important ECI are unconfounded, based on the aforementioned physical hierarchy, key clusters up to a size of n = 32 will be included. From the moment theorem, clusters beyond a certain order should have negligible ECI for metallic alloys; a properly truncated CE neglecting such terms will still predict the energies well. Given the CE basis set requirements stated in Sec. V C, only a few multibody ECI are generally significant (shown for Ag-Au in Sec. VII). Thus, we can construct a CE using subspaces in the 32-Cubic space, Fig. 3, to unconfound important multibody ECI for most alloys.

Two 8-atom subspaces within the 'complete' 32-Cubic space are considered here: the 8-Rh and the 8-DO22 subspaces, Fig. 3, consisting of structures generated by a symmetric rhombohedral cell and a (less symmetric) rectangular cell, respectively. The translation vectors of these supercells are given in Table II. The complete space of each 8-atom subspace consists of 2^8 configurations. However, due to the underlying lattice symmetry and cell shape, there are only 16 and 27 unique structures for the 8-Rh and 8-DO22 subspaces, respectively. These two subspaces overlap, with the groundstate structures generated by the 4-Cubic space common to both. Each of the subspaces contains the usual 'suspects' for FCC groundstate structures: L1_0, A-rich L1_2, B-rich L1_2, pure A and B. Low-energy configurations related to the DO22 structure are only present in 8-DO22. On top of multibodies beyond 1st NN, both the 8-Rh and 8-DO22 spaces must necessarily unconfound J01, J11, J21, J41 because they both encompass the 4-Cubic space (see Sec. V D). For clarity, when we say an ECI is unconfounded, it implies that the ECI is unconfounded from lower-order (smaller n) and shorter-ranged ECI.



A. Full Augmented Subspace-Projection: 
unconfounding longer-range pairs 

For a unique trCE, longer-ranged pairs are more critical than shorter-ranged multibodies. Using the methods in Sec. V A, the confounding relations for the subspaces can be worked out. However, as noted, the use of a larger supercell unconfounds multibody (n > 2) clusters faster than longer-ranged pairs (J2f remains significant to a longer range than multibody Jnf with n > 2). As it turns out, see Fig. 3, in going from a 4-Cubic cell to 8-atom cells, one only unconfounds pairs up to J22 for the 8-Rh subspace and up to J23 for the 8-DO22 subspace. On inspection of the cell geometry, we observe that even the 'complete' 32-Cubic space unconfounds only up to the 4th NN pairs (J24) and the 6th NN pair (J26) while the 5th NN pair (J25) remains confounded, see Fig. 3. Structures from an augmented space must be added to unconfound J25 and those beyond 6th NN.

Our augmentation approach allows greater flexibility than the original FFD. Each targeted ECI is unconfounded by adding one structural energy from an augmenting space to E1, so, notably, the number of ECI in Ĵ1 equals the number of structures in E1. When the configuration space used is large enough to unconfound important ECI, the physical hierarchy ensures a unique trCE that approaches the exact one for the alloy. Collectively, the concepts discussed and illustrated here constitute our subspace-projection formalism.

To unconfound J24, it suffices to combine 8-DO22 with non-overlapping configurations from 8-Rh. To unconfound J25, a structure from an augmented space orthogonal to the 32-Cubic space must be added. More structures from an augmented space orthogonal to 32-Cubic are needed to unconfound longer-ranged pairs; just three more structures are required to produce an excellent CE for Ag-Au, see Sec. VII.



B. Practical Considerations 



When complete subspaces, e.g., 8-Rh and 8-DO22, are considered, the CE basis is locally complete and Φ11, see (9), is directly related to the Hadamard matrix, see (37). The confounding relations with longer-ranged ECI (external to the subspace) can be worked out geometrically, using concepts from FFD. However, if only some of the structures in the subspace are used in structural inversion, Φ11 may not be a Hadamard matrix and confounding relations between the ECI may not be determined simply from the geometry of the subspace. In such cases, we still construct Ĵ1 according to the physical hierarchy, but we check against confounding between the ECI by ensuring that Φ11^T Φ11 is determinate.

TABLE II. Translation vectors of FCC supercells representing the various spaces in Fig. 3, with lattice constant a = 2. The number of sites and symmetry-unique structures generated by each supercell are listed under Ns and Nc, respectively, with some example structures shown. The 4-Cubic space is a subspace of 8-Rh and 8-DO22, both of which form (overlapping) subspaces within the 32-Cubic space. Nc was not evaluated exactly for the 32-Cubic, which covers a space of 2^32 non-unique configurations. The confounding relations for n = 2 ECI up to 6th NN are shown too. Unless assigned the letter 'N' (not confounded), the ECI is confounded with other ECI of higher importance (those of smaller n and shorter range are listed) in the particular subspace.

Subspaces             4-Cubic        8-Rh         8-DO22         32-Cubic
Ns                    4              8            8              32
Nc                    5              16           27
Trans. vectors        [2 0 0]        [2 2 0]      [2 0 0]        [4 0 0]
                      [0 2 0]        [2 -2 0]     [0 4 0]        [0 4 0]
                      [0 0 2]        [0 2 2]      [0 0 2]        [0 0 4]
Example structures    Ag, Au,        4-Cubic,     4-Cubic,       All
                      L1_0, L1_2                  DO22, ...

Confounded?
J21                   N              N            N              N
J22                   J0             N            N              N
J23                   J21            J21          N              N
J24                   J0             J0           J0, J22        N
J25                   J21            J21          J21            J21
J26                   J0             J22          J0, J22        N
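
The determinacy check can be carried out numerically as a rank test. The sketch below is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, the matrix rows are the standard NN correlation functions (empty, point, pair, tetrahedron) for pure A, pure B and L1_0, and the augmenting row is the A-rich L1_2 structure.

```python
import numpy as np


def is_determinate(phi, tol=1e-10):
    """True if phi^T phi is invertible, i.e. phi has full column rank."""
    phi = np.asarray(phi, dtype=float)
    gram = phi.T @ phi
    return int(np.linalg.matrix_rank(gram, tol=tol)) == phi.shape[1]


# Columns: empty, point, NN pair, NN tetrahedron correlation functions.
# Rows: pure A, pure B, L1_0 (the structures of a 2-site cell).
PHI_2SITE = [
    [1.0,  1.0,  1.0,     1.0],
    [1.0, -1.0,  1.0,     1.0],
    [1.0,  0.0, -1.0 / 3, 1.0],
]
# Augment with one structure (A-rich L1_2) whose tetrahedron correlation
# differs from the empty-cluster column.
PHI_AUGMENTED = PHI_2SITE + [[1.0, 0.5, 0.0, -1.0]]

print(is_determinate(PHI_2SITE))      # tetrahedron column equals empty column
print(is_determinate(PHI_AUGMENTED))
```

In the 2-site rows the tetrahedron column is identical to the empty-cluster column, so the two ECI cannot be separated; adding a single structure that breaks this equality restores a determinate system for these four columns.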



VII. RESULTS 

The various CE results from the subspace-projection formalism in Sec. VI are showcased using structures from the 8-Rh, 8-DO22 and augmented subspaces. To distinguish different sets of CE, we classify the structures in each CE set by the triplet {a, b, c}: 'a' is the number of structures from 8-Rh, which includes all structures generated by the 4-Cubic space (see Table II), 'b' gives the number of additional structures from 8-DO22 not found in 8-Rh, and 'c' is the number of additional structures from the augmented space orthogonal to both 8-Rh and 8-DO22. The total number of structures used for direct structural inversion (SI) is a + b + c.

For Ag-Au alloys, we use a database of 95 DFT structural formation energies (E_f^DFT, from the smallest-first algorithm and the subspaces above) for construction, verification and comparison of various sets of trCE. Some of the structures generated by the smallest-first algorithm
TABLE III. Jnf (in meV) and their degeneracies for Ag-Au for different CE sets. For subspace projection, {a, b, c} represents the number of symmetry-distinct structures (see text) from 8-Rh, 8-DO22 and augmented spaces, respectively. Left blank are ECI of clusters not used during SI, while ECI smaller than |0.005| meV are listed as ±0.00. Set {16, 0, 0} contains 5 other ECI (with 5 ≤ n ≤ 8) that are not listed as they are smaller than |0.005| meV. Set {8, 4, 4}* uses the same structures as {8, 4, 4}, but unconfounds J44 instead of J27 (see text). For comparison, ECI selected via the CV1 score using a 55-structure learning set are also given. The rms (ε_rms) and maximum (ε_max) deviation of E_f^CE with respect to E_f^DFT are shown with references to the figures of E_f vs. %Au.









Jnf (meV) 


n 


/ 


Dnf 


{5, 0, 0} 


{16, 0, 0} 


{8, 0, 0} 


{8, 4 ,0} 


{8, 4, 1} 


{8, 4, 2} 


{8, 4, 4} 


{8, 4, 6} 
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{8, 4, 4}* 





1 


1 


-ouiu.oi 




QOOQ 9(S 


-ouuo. yo 


-ouuo. yo 


QOI 91 


Qm 91 


Qfii n ^^9 






1 


1 


1 


-238.13 


-237.90 


-237.89 


-237.89 


-237.89 


-237.89 


-237.35 


-237.25 


-237.23 


-237.35 


2 


1 

2 
3 
4 
5 

6 
7 
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12 

6 

24 
12 
24 

8 

48 
6 


7.65 


7.61 
-0.33 


7.58 
-0.35 


7.24 
-0.35 

0.17 
-0.05 


6.59 
-0.35 

0.17 
-0.05 

0.33 


6.79 
0.06 
0.17 
0.16 
0.22 
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6.79 
0.06 
0.36 
0.16 
0.22 
-0.31 
-0.09 


6.90 
0.06 
0.28 
0.16 
0.17 
-0.31 
-0.05 
0.10 


6.88 
0.10 
0.28 
0.15 
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2 
3 
4 
5 
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-0.02 
0.06 
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-0.18 
-0.00 
0.06 
-0.11 
-0.00 


-0.18 
-0.06 
0.06 
-0.05 


-0.18 
-0.02 
0.06 
-0.09 


4 


1 
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48 
48 
12 


-0.16 


-0.07 
0.02 
-0.02 
-0.03 


-0.16 
0.03 


-0.07 
0.03 
-0.02 


-0.07 
0.03 
-0.02 


-0.07 
0.03 
-0.02 


-0.07 
0.03 
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-0.07 
0.03 
-0.02 
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-0.16 
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4(e) 


4(f) 
are not within the 32-Cubic space. The formation energy is defined as

    E_f(\sigma) = E(\sigma) - c(\sigma) E(Au) - (1 - c(\sigma)) E(Ag) ,   (48)

with c(σ) being the concentration of Au in the given structure defined by σ. The E_f^DFT are estimated to be converged in the range of 0.2 meV, also setting the lower limit for precision.
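
Equation (48) translates directly into a one-line function. This is a minimal sketch with hypothetical names; it assumes all energies are given per atom in the same units, and the numbers in the usage line are made up for illustration only.

```python
def formation_energy(e_total, c_au, e_au, e_ag):
    """Formation energy per atom, following Eq. (48).

    e_total : energy per atom of the structure
    c_au    : Au concentration of the structure (0 to 1)
    e_au, e_ag : energies per atom of pure Au and pure Ag
    ASSUMPTION: all energies are per atom and share the same units.
    """
    return e_total - c_au * e_au - (1.0 - c_au) * e_ag


# Illustrative (made-up) numbers, not DFT data.
E_F_EXAMPLE = formation_energy(-3.0, 0.5, -3.2, -2.7)
print(E_F_EXAMPLE)
```

By construction the pure elements have zero formation energy, which is a convenient sanity check on any data set.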

The quality of each CE set is evaluated by the root-mean-square (rms) deviation of E_f^CE with respect to E_f^DFT for all 95 Ag-Au structures, which includes structures not in the 32-Cubic space, i.e.,

    \epsilon_{rms} = \left[ \frac{1}{95} \sum_{i=1}^{95} \left( E_f^{CE}(\sigma_i) - E_f^{DFT}(\sigma_i) \right)^2 \right]^{1/2} .   (49)

Via consideration of various subspaces, we show that the unique CE set obtained with up to ~16 structural energies is sufficient to reproduce very well the 95 E_f^DFT (within the convergence errors of E_f^DFT), as compared to 55 structural energies used for CV1 optimal fitting.
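
The rms deviation of (49), together with the maximum deviation quoted alongside it in the tables and figures, can be computed as follows. This is a small sketch with hypothetical names; it works for any number of structures, of which 95 is the case used here.

```python
import math


def deviations(e_ce, e_dft):
    """Return (rms, max) deviation of CE energies from DFT energies.

    The rms value follows Eq. (49) with the number of structures taken
    from the input length; the maximum is the largest absolute deviation.
    """
    if len(e_ce) != len(e_dft) or not e_ce:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [a - b for a, b in zip(e_ce, e_dft)]
    rms = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return rms, max(abs(d) for d in diffs)


# Illustrative (made-up) values, not data from the paper.
RMS, MAX_DEV = deviations([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(RMS, MAX_DEV)
```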



A. Subspace-Projection CE 

Figure 4 shows the E_f versus c (at.% Au) for various CE sets from subspace projection, with their ECI listed in Table III. Starting with the CE set {5, 0, 0}, where the full 4-Cubic subspace is used, there are 5 unique structures, which incidentally are groundstate structures for Ag-Au; hence, 5 ECI (up to the NN range) are used in Ĵ1. From subspace projection, Φ11, see (12), is a full-rank matrix, so the E_f^DFT of the 5 structures are reproduced exactly by Ĵ1. Although the E_f^DFT in Fig. 4(a) for structures on the groundstate hull are reproduced well, some other structures are not distinguished due to the use of a small set of ECI. Among ECI responsible for ordering (n ≥ 2), J21 dominates. The dominance by lower-order, short-range clusters is explained by the moment theorem, and the electronic-structure origins have been verified via direct DFT calculations.

Set {16, 0, 0}, given in Fig. 4(b), is constructed from the complete 8-Rh subspace, which unconfounds more multibody interactions; those with n ≥ 5 are negligible (<0.005 meV) because they are much smaller than the convergence error of the E_f^DFT data. These negligible multibody ECI improve the quality of the CE only marginally; the ε_rms for E_f is similar to that of {5, 0, 0} despite having 11 more ECI. As such, one can reduce the computational cost of DFT calculations by using only a fraction of the structures in the subspace for E1, see (12).

We construct subset {8, 0, 0} with a fraction of the structures from {16, 0, 0}, retaining only ECI with significant magnitude (i.e., neglecting ECI with n ≥ 5) in J1. Although this leads to 'internal' confounding between the original set of 16 ECI, it does not change the quality of the CE because the neglected ECI are negligible. This results in only a slight increase in ε_rms (see Fig. 4(c)) and a minimal change (~0.05 meV) in the values of the ECI (Table III).



B. Subspace-Projection + Augmentation 

Although ECI with 2 ≤ n ≤ 4 are significant for set {16, 0, 0}, this could be a result of confounding with longer-ranged but important ECI. Given the physical insight that lower-order ECI are longer range, the 8-Rh subspace does a poor job of unconfounding the longer-ranged pairs; {16, 0, 0} only unconfounds up to the 2nd NN pairs and triplets. We seek to include longer-ranged ECI (in particular, n = 2) by including structures from other subspaces (augmentation).

Four structures from the 8-DO22 subspace are added to the set {8, 0, 0} and the resulting {8, 4, 0} set unconfounds J23, J24, J33 and J43, reducing the ε_rms by ~0.2 meV. The E_f^CE of L1_2 and DO22 are now distinguished and reproduce the E_f^DFT, see Fig. 4(d). However, the E_f^DFT of high-energy structures are still not reproduced well, but can be improved by including pairs beyond 4th NN.

To further unconfound pairs, structures from an augmented space (see Table IV) are required, because combining 8-Rh and 8-DO22 (subspaces of 32-Cubic space) at best unconfounds J24. Two structures at 0.5 Au are added in turn to give sets {8, 4, 1} and {8, 4, 2}, shown in Figs. 4(e) and 4(f); unconfounding J25 and J26 leads to a significant improvement in ε_rms (by ~0.5 meV), which is now within the convergence errors of our E_f^DFT data. Unconfounding J27 and J34 with set {8, 4, 4} further reduces ε_rms to 0.42 meV, see Fig. 5(b).

Hence, with a subset of 12 structures from the Hilbert subspace, augmented by 4 structures (having cluster functions orthogonal to that subspace) to unconfound longer-ranged pairs, an excellent-quality trCE for Ag-Au with key physical ECI is found using only 16 structures.



VIII. DISCUSSION 

Below we discuss the relationship to and comparison 
with standard statistical fitting methods, and recently 
suggested regularization using Bayesian concepts. 



TABLE IV. Translation vectors of 16 FCC structures (prior to atomic relaxation) used in CE set {8, 4, 4} with their affiliated subspaces given. The denominator in column 'Fraction Au' gives the number of atomic sites in the unit cell of each structure. Structures SM#21, 27, 06 and 07 are from the augmented space ⊥ to both 8-Rh and 8-DO22 subspaces, and except for SM#21 are also ⊥ to the 32-Cubic space.

Tag       Fraction Au   Affiliated spaces     Translation vectors
Ag        0             All                   [0 1 1], [1 0 1], [1 1 0]
Au        1             All                   [0 1 1], [1 0 1], [1 1 0]
L1_0      1/2           8-Rh, 8-DO22          [1 1 0], [1 -1 0], [0 0 2]
L1_2      1/4           8-Rh, 8-DO22          [2 0 0], [0 2 0], [0 0 2]
L1_2      3/4           8-Rh, 8-DO22          [2 0 0], [0 2 0], [0 0 2]
8-Rh#3    3/8           8-Rh                  [2 2 0], [2 -2 0], [0 2 2]
8-Rh#7    4/8           8-Rh                  [2 2 0], [2 -2 0], [0 2 2]
8-Rh#9    5/8           8-Rh                  [2 2 0], [2 -2 0], [0 2 2]
DO22      1/4           8-DO22                [2 0 0], [0 2 0], [1 1 2]
DO22      3/4           8-DO22                [2 0 0], [0 2 0], [1 1 2]
SM#13     2/4           8-DO22                [2 0 0], [0 2 0], [1 1 2]
SM#24     2/4           8-DO22                [4 0 0], [0 1 -1], [0 1 1]
SM#21     2/4           Aug., in 32-Cubic     [1 -1 0], [2 2 0], [0 0 2]
SM#27     2/4           Aug., ⊥ 32-Cubic      [3 3 2], [0 1 -1], [-1 1]
SM#06     1/3           Aug., ⊥ 32-Cubic      [1 1 0], [1 -1 0], [1 3]
SM#07     2/3           Aug., ⊥ 32-Cubic      [1 1 0], [1 -1 0], [1 3]



A. Subspace-projection versus CVi Fitting 

We now compare the {8, 4, 4} subspace-projection trCE with the trCE obtained by minimizing CV1, which uses at least 55 structures (not necessarily from the 32-Cubic space) as the learning set. Unlike subspace-projection, which uses 16 structures for direct SI, CV1 selects a set via a statistical fit and is allowed to have fewer ECI than the number of DFT energies used for SI. We emphasize that our CV1 selection also uses the same hierarchy of clusters to ensure a locally complete CE, unlike others. The small improvement of ε_rms by 0.1 meV for the optimal CV1 set is a result of using more than 3 times the structures in the learning set; that is, the least-squares error is minimized in (12) over 55 structures, which is a large fraction of the 95 structures used for validation by ε_rms in (49). So, it is not surprising that there is a slight improvement using CV1, because the ECI values are altered to improve the fit.

To facilitate comparison of the ECI, we further construct set {8, 4, 6}, unconfounding J28 and J35. The improvement in ε_rms is insignificant versus {8, 4, 4}. As observed in Table III, the ECI of {8, 4, 4}, {8, 4, 6} and the CV1 (55-structure) fit are very close to one another, signifying a convergence in ECI, within errors of E_f^DFT.

We see that the selection of ECI based on a physical hierarchy is of primary importance, because once the physically important ECI are unconfounded, the exact CE of the alloy system is approached. At this point, the ECI and the accuracy of the trCE are similar regardless of the number of structural energies used in the learning set. For example, in the {8, 4, 4} subspace, without the physical hierarchy, one could have unconfounded J44 instead of the physically more important J27 ({8, 4, 4}* versus {8, 4, 4}, respectively, in Table III), producing a CE with worse predictive capability, see Fig. 6. The bottom line: ECI have physical meanings and they should not be treated merely as coefficients obtained from statistical fitting.

FIG. 4. (color online) E_f^CE (meV) versus %Au from CE sets (diamonds) using subspace projection from (a) {5, 0, 0}, (b) {16, 0, 0}, (c) {8, 0, 0}, (d) {8, 4, 0}, (e) {8, 4, 1} and (f) {8, 4, 2}, with E_f^DFT ('+') for all 95 Ag-Au structures. Structures [(red) squares] used for structural inversion (SI) are marked. The rms (ε_rms) and maximum (ε_max) deviation of E_f^CE from E_f^DFT are given for each CE set. Only the 8-Rh subspace (including L1_0 and L1_2) is used in (a) to (c), which unconfounds up to the 2nd-NN pair at most, and they have similar ε_rms. (d) Adding 4 structures from the 8-DO22 cell unconfounds the 4th-NN pair and gives significantly better ε_rms, although high-energy structures are less well reproduced; these energies can only be improved by unconfounding the 5th- and 6th-NN pairs, (e) and (f), respectively, using (up to) 2 new structures from an augmented space.
FIG. 5. (color online) E_f^CE (meV) versus %Au (diamonds) using (a) CE selected via CV1 using 55 structures (squares) and (b) CE from the {8, 4, 4} subspace-projection with 16 structures (squares). E_f^DFT are denoted by '+'.



B. Relation to Bayesian Approaches 

The physical hierarchy of clusters utilized in the present paper for unconfounding (also used in our previous CV1 CE) would modify the usually assumed "uniform" (i.e., otherwise uninformative) prior distribution for the ECI, J, within the Bayesian framework. The posterior probability of J given E1 is

    P(J|E_1) \propto P(E_1|J) \, P^0(J) ,   (50)

where J = [J1, J2]^T. J contains the truncated (non-zero) J1 from SI and excluded (possibly zero) J2. Here, P^0(J) is the prior distribution, which is non-zero only for trCE whose ECI are locally complete and follow the physical hierarchy in our subspace-projection CE. In contrast, a uniformly distributed P^0(J) assumes all trCE are possible, regardless of being physical or not.

Recently, by an assumption that the ECI of a given cluster results from a large number of a priori random contributions, a Gaussian prior distribution was proposed; additionally, a decaying weight was assumed to cut off smoothly contributions from ECI that otherwise are assumed zero. To be clear, our confounding relations, see, e.g., (42), reflect mathematically the specific ECI in J2 (albeit with a priori unknown values) that directly affect those in Ĵ1. Our a priori choice can be to set all J2 to zero and validate using structural energies not in the learning set. (A posteriori we can augment the subspace to systematically unconfound.) Or we can assume that P^0(J) decays according to some specifically chosen distribution, which certainly may be included in the present formalism.

FIG. 6. (color online) E_f^CE (meV) versus %Au (diamonds) using CE set {8, 4, 4}* where J44 is added without observing the physical cluster hierarchy. E_f^DFT are denoted by '+'.
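
To make the connection with (50) concrete, the following is a sketch of the standard Gaussian case, under assumptions not stated in the text: independent Gaussian errors of variance \sigma_\epsilon^2 in the energies, and independent zero-mean Gaussian priors of width \tau_\alpha for each ECI J_\alpha. The symbols \sigma_\epsilon, \tau_\alpha and \Lambda are introduced here for illustration only.

```latex
% Likelihood and prior (assumed forms)
P(E_1|J) \propto \exp\!\left[-\frac{\lVert E_1 - \Phi J\rVert^2}{2\sigma_\epsilon^2}\right],
\qquad
P^0(J) \propto \exp\!\left[-\sum_\alpha \frac{J_\alpha^2}{2\tau_\alpha^2}\right]

% Maximizing the posterior P(J|E_1) of Eq. (50) then gives
\hat{J} = \left(\Phi^{T}\Phi + \sigma_\epsilon^2\,\Lambda\right)^{-1}\Phi^{T}E_1,
\qquad
\Lambda = \mathrm{diag}\!\left(1/\tau_\alpha^2\right)
```

In this picture, letting \tau_\alpha become very large for the included ECI and very small for the excluded ones recovers an inversion restricted to the included set, while a smoothly decaying \tau_\alpha corresponds to the decaying-weight priors mentioned above.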



IX. CONCLUSION 

To construct a unique truncated CE, we presented an Augmented Subspace-Projection formalism using the mathematics of Hilbert spaces and concepts from Fractional Factorial Design (FFD) that directly select the critical, a priori unknown ECI in the included set of J1 (with excluded ones in J2). As exemplified for binary alloys with an N-site lattice and Hilbert space of 2^N configurations, structural energies can be reproduced by an estimator Ĵ1, containing a minimal set of physical ECI that approaches the exact ECI. When N is large, DFT calculations are feasible only for a vanishingly small fraction of the 2^N structures, resulting in linear dependencies between basis functions such that Ĵ1 is confounded with specific ECI in J2. However, from FFD concepts, this confounding between ECI can be determined, so only a few (~16) structures are needed to construct, without fitting, a reliable CE with quantifiable errors, see (32)-(34). Of course, no statistical fitting does not imply no statistical validation.

For practical applications using structures with periodic boundary conditions, we showed that the confounding relations between ECI can be identified geometrically when subspaces (chosen supercells that lie within the defined Hilbert space) are considered. A physical hierarchy of ECI provides a condition to obtain a physical set of truncated ECI. Although the CE from the subspaces are complete, longer-ranged pairs can remain confounded with truncated ECI, which can be unconfounded by augmenting the subspace.

Using FCC Ag-Au as a case study, we defined an initial subspace by an 8-atom rhombohedral cell (8 structures), which is then augmented to construct a unique truncated CE. This augmented subspace-projection formalism using 16 structures, without fitting, produces a CE with predictive capability similar to that obtained from a CV1 statistical fit using at least 55 structures. The concepts discussed above can be generalized to multicomponent alloys.
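Appendix A: Derivation of Error Terms

We derive the decomposition of the MSE into variance and bias terms shown in Section II A:

    MSE = \langle ( \hat{E}(\sigma) - E(\sigma) )^2 \rangle
        = \langle ( \hat{E}(\sigma) - \langle \hat{E}(\sigma) \rangle + \langle \hat{E}(\sigma) \rangle - E(\sigma) )^2 \rangle
        = \langle ( \hat{E}(\sigma) - \langle \hat{E}(\sigma) \rangle )^2 \rangle + \langle ( \langle \hat{E}(\sigma) \rangle - E(\sigma) )^2 \rangle
          + 2 \langle ( \hat{E}(\sigma) - \langle \hat{E}(\sigma) \rangle ) ( \langle \hat{E}(\sigma) \rangle - E(\sigma) ) \rangle
        = Var + Bias + 2 Cross ,   (A1)

where \langle ... \rangle denotes expectation values averaged over all possible observations having the same atomic configuration, σ.

    Cross = \langle \hat{E}(\sigma) \langle \hat{E}(\sigma) \rangle \rangle - \langle \hat{E}(\sigma) E(\sigma) \rangle
            - \langle \langle \hat{E}(\sigma) \rangle \langle \hat{E}(\sigma) \rangle \rangle + \langle \langle \hat{E}(\sigma) \rangle E(\sigma) \rangle
          = \langle \hat{E}(\sigma) \rangle \langle \hat{E}(\sigma) \rangle - \langle \hat{E}(\sigma) \rangle E(\sigma)
            - \langle \hat{E}(\sigma) \rangle \langle \hat{E}(\sigma) \rangle + \langle \hat{E}(\sigma) \rangle E(\sigma) = 0 .   (A2)

We have used the fact that \langle \langle \hat{E}(\sigma) \rangle \rangle = \langle \hat{E}(\sigma) \rangle and \langle E(\sigma) \rangle = E(\sigma). For the variance term,

    Var = \langle ( \hat{E}(\sigma) - \langle \hat{E}(\sigma) \rangle )^2 \rangle ,   (A3)

which is governed by the variance of the randomly distributed error ε. For the bias term,

    Bias = \langle ( \langle \hat{E}(\sigma) \rangle - E(\sigma) )^2 \rangle
         = ( \langle \hat{E}(\sigma) \rangle - E(\sigma) )^2 .   (A4)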
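
The decomposition in (A1) and the vanishing cross term in (A2) can be verified numerically for any finite set of observations. The sketch below uses hypothetical names and made-up numbers; the expectation value is taken as the plain average over the supplied observations, and "Bias" denotes the squared term, following the convention of (A1).

```python
def mse_terms(observed, exact):
    """Return (MSE, Var, Bias, Cross) for repeated observations of one
    configuration, with expectation values taken as sample averages."""
    n = len(observed)
    mean = sum(observed) / n
    mse = sum((e - exact) ** 2 for e in observed) / n
    var = sum((e - mean) ** 2 for e in observed) / n
    bias = (mean - exact) ** 2  # squared, as in (A1)
    cross = sum((e - mean) * (mean - exact) for e in observed) / n
    return mse, var, bias, cross


# Illustrative (made-up) observations scattered around an exact value of 1.
MSE, VAR, BIAS, CROSS = mse_terms([1.02, 0.97, 1.05, 0.99], 1.00)
print(MSE, VAR + BIAS, CROSS)
```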